Thanks to Wix for supporting PBS Digital Studios.
Hey, I’m Jabril and welcome to Crash Course AI!
So far in this series, we’ve focused on artificial intelligence that uses Supervised
Learning.
These programs need a teacher to use labeled data to tell them “right” from “wrong.”
And we humans have places where supervised learning happens, like classrooms with teachers,
but that’s not the only way we learn.
We can also learn lots of things on our own by finding patterns in the world.
We can look at dogs and elephants and know they’re different animals without anyone
telling us.
Or we can even figure out the rules of a sport just by watching people play.
This kind of learning without a teacher is called Unsupervised Learning and, in some
cases, computers can do it too.
INTRO
The key difference between supervised and unsupervised learning is what we’re trying
to predict.
In supervised learning, we’re trying to build a model to predict an answer or label
provided by a teacher.
In unsupervised learning, instead of a teacher, the world around us is basically providing
training labels.
For example, if I freeze this video of a tennis ball RIGHT NOW, can you draw what could be
the next frame?
Unsupervised learning is about modeling the world by guessing like this, and it’s useful
because we don’t need labels provided by a teacher.
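To make the "world provides the labels" idea concrete, here's a minimal Python sketch. It assumes a made-up list of numbers standing in for video frames (say, the tennis ball's height in each frame), and the name next_frame_pairs is purely illustrative. The point is that each frame's "label" is just the frame that follows it, so nobody has to write labels by hand.

```python
# Invented data: each "frame" is just the ball's height at one moment.
frames = [0.0, 1.0, 1.8, 2.4, 2.8, 3.0]

def next_frame_pairs(sequence):
    """Pair every frame with the frame that comes right after it."""
    return list(zip(sequence[:-1], sequence[1:]))

pairs = next_frame_pairs(frames)
for current, target in pairs:
    # The target comes from the data itself, not from a teacher.
    print(f"saw {current}, should guess {target}")
```

A model trained on pairs like these is graded by the world: its guess is compared with what actually happened next.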
Babies do a lot of unsupervised learning by watching and imitating people, and we’d
like computers to be able to learn like this as well.
This lets us utilize lots of freely available data in the world or on the internet.
In many cases, one of the easiest ways to understand how AI can use unsupervised learning
is by doing it ourselves, so let’s look at a few photos of flowers with no labels.
The most basic way to model the world is to assume that it’s made up of distinct groups
of objects that share properties.
So, for example, how many types of flowers are here?
We could say there are two because there are two colors, purple and yellow.
Or we could look at the petal shapes, and divide them into round petals and tall vertical ones.
Or maybe we have some more experience with flowers and realize that two of these are
tulips, one is a sunflower, and one is a daisy, so there are three categories.
Immediately recognizing different properties like this and creating categories is called
unsupervised clustering.
We don’t have labels provided by a teacher, but we do have a key assumption about the
world that we’re modeling: certain objects are more similar to each other than others.
We can program computers to perform clustering too.
But to do that, we need to choose a few properties of flowers we’re interested in looking at,
like how we picked color or shape just now.
For a more realistic example, let’s say I bought a packet of iris seeds to plant in
my garden.
After the flowers bloom though, it looks like there were several species of irises mixed
up in that one packet.
Now I’m no expert gardener, but I can use some AI to help me analyze my garden.
To construct a model, we have to answer two key questions.
First, what observations can we measure?
All of these flowers are purple, so that’s probably not the best way to tell them apart.
But different irises seem to have different petal lengths and widths, which we can measure
and place on this graph with petal length on the Y axis and width on the X axis.
And second, how do we want to represent the world?
We’re going to stick to a very simple assumption here: there are clusters in our data.
Specifically, we’re going to say there are some number of groups called K clusters, but
we don’t know where they are.
To help us, we’re going to use the K-means clustering algorithm.
K-means clustering is a simple algorithm.
All it needs is a way to compare observations, a way to guess how many clusters exist in
the data, and a way to calculate averages for each cluster it predicts.
In particular, we want to calculate the mean of each cluster by adding up all the data points in that cluster
and dividing by the number of points in it.
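Two of those ingredients fit in a few lines of Python: comparing observations and averaging a cluster. Treat this as a sketch rather than the definitive way to do it. The points are invented, the name cluster_mean is hypothetical, and straight-line distance (math.dist from the standard library, Python 3.8 or newer) is an assumed choice for "a way to compare observations."

```python
import math

# Invented 2-D points standing in for (petal width, petal length).
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]

def cluster_mean(points):
    """Add the points up coordinate by coordinate, then divide by how many there are."""
    count = len(points)
    return tuple(sum(coordinates) / count for coordinates in zip(*points))

print(cluster_mean(cluster))               # (3.0, 4.0)
print(math.dist((0.0, 0.0), (3.0, 4.0)))   # 5.0
```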
Remember, unsupervised learning is about modeling the world, so our algorithm will have two
steps:
First, our AI will predict.
What does the model expect the world to look like?
In other words, which flowers should be clustered together because they’re the same species?
Second, our AI will correct or learn.
The model will update its beliefs to agree with its observation of the world.
To start the process, we have to specify how many clusters the model should look for.
I’m guessing there are three clusters in the data, so that becomes the model’s initial
understanding of the world, and we’re looking for K=3 averages, or three types of irises.
But to start, our model doesn’t really know anything, so the averages are random and so
are its predictions.
Each datapoint (which is a flower) is given a label as type1, type2, or type3, based on
the algorithm’s beliefs.
Next, our model tries to correct itself.
The average of each cluster of datapoints should be in the middle, so the model corrects
itself by calculating new averages.
We can see those averages here, marked with Xs, which gives our updated model of the three
(or so we guessed) types of irises.
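Here's roughly what that first round could look like in Python: a random guess, then one correction. This is a sketch under assumptions, not the exact program behind the graph. The measurements are invented, the seed is arbitrary, and cluster_averages is a hypothetical helper name.

```python
import random

# Invented (petal width, petal length) measurements in cm, for illustration only.
flowers = [(0.2, 1.4), (0.3, 1.5), (1.3, 4.2), (1.5, 4.6), (2.1, 5.8), (2.3, 6.0)]
K = 3

rng = random.Random(0)                         # fixed seed so the run is repeatable
labels = [rng.randrange(K) for _ in flowers]   # the model knows nothing yet: random labels

def cluster_averages(points, labels, k):
    """Return the mean point for each label, or None if no point carries that label."""
    averages = []
    for label in range(k):
        members = [p for p, l in zip(points, labels) if l == label]
        if not members:
            averages.append(None)
            continue
        width = sum(p[0] for p in members) / len(members)
        length = sum(p[1] for p in members) / len(members)
        averages.append((width, length))
    return averages

averages = cluster_averages(flowers, labels, K)   # these are the Xs on the graph
print(labels)
print(averages)
```

Because the labels start out random, the averages usually land in odd places, which is why the first picture looks noisy.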
The graph is still pretty noisy.
For example, it’s a little weird that there are type2 flowers so close to the average
for type3.
But we did start with a random model, so we can’t expect too much accuracy.
Logically, we know that irises of the same species tend to have similar petals, so those
datapoints should be clustered together.
Since we just did a correction or learning step, we can repeat the process, starting
with a new prediction step.
Let’s predict new labels using the Xs that mark the averages of each label.
We’ll give every datapoint the label of its closest X -- type1, type2, or type3 -- and
then we’ll calculate new averages.
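The prediction step on its own is tiny. In this sketch the Xs and the flowers are invented numbers, nearest_label is a hypothetical name, and "closest" is assumed to mean straight-line distance.

```python
import math

def nearest_label(point, averages):
    """Index of the average (the X) that sits closest to this point."""
    return min(range(len(averages)), key=lambda i: math.dist(point, averages[i]))

# Invented positions for the three Xs and four flowers.
averages = [(0.25, 1.5), (1.4, 4.4), (2.2, 5.9)]
flowers = [(0.2, 1.4), (1.5, 4.6), (2.3, 6.0), (1.2, 4.0)]

labels = [nearest_label(flower, averages) for flower in flowers]
print(labels)  # [0, 1, 2, 1]
```

Index 0, 1, and 2 play the role of type1, type2, and type3.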
That’s better, but still not the cleanest clusters, so we can repeat the process again:
Predict, Learn, Predict, Learn.
Eventually, the Xs will stop moving and we have a model of iris clusters created with
unsupervised learning!
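Putting Predict and Learn in a loop gives the whole algorithm. The sketch below makes a few assumptions worth flagging: the flower measurements are invented, the Xs start on K randomly chosen flowers (a common variation on "start random"), an X with no flowers assigned simply stays put, and the loop gives up after max_rounds rounds just in case.

```python
import math
import random

def kmeans(points, k, seed=0, max_rounds=100):
    """Alternate predict and learn steps until the averages (the Xs) stop moving."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    centers = rng.sample(points, k)     # assumption: start the Xs on k random flowers
    labels = []
    for _ in range(max_rounds):
        # Predict: label every flower with the index of its closest X.
        labels = [min(range(k), key=lambda i: math.dist(p, centers[i])) for p in points]
        # Learn: move each X to the average of the flowers carrying its label.
        new_centers = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[i])  # nothing assigned: leave this X alone
        if new_centers == centers:      # the Xs stopped moving
            break
        centers = new_centers
    return centers, labels

# Invented (petal width, petal length) measurements in cm.
flowers = [(0.2, 1.4), (0.3, 1.5), (1.3, 4.2), (1.5, 4.6), (2.1, 5.8), (2.3, 6.0)]
centers, labels = kmeans(flowers, k=3)
print(centers)
print(labels)
```

Where the Xs end up can depend on where they start, so in practice people often try several random starts and keep the tightest clusters.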
Now the ultimate question is, did we find meaningful patterns about the world with our
AI?
We made an assumption that there were three types of irises, and we assumed that they
have different petal lengths and widths.
Was this true?
Lucky for us, I have a friend who is a master gardener.
I showed him the real-life flowers closest to each of the three averages and he said
that type1 is Versicolor, type2 is Setosa and type3 is Virginica.
Three different iris species!
We learned about the world from observation, which is what makes this unsupervised learning,
even though we relied a tiny bit on a teacher (the master gardener) for confirmation and help.
Now that we’ve learned the basics, we can experiment with harder examples.
Let’s say we want to use an unsupervised learning algorithm to sort a bunch of different
photos, not just three iris species.
First, what observations can we measure?
How much green there is?
Whether there’s a nose and fur?
To have a computer make these observations, we need to measure thousands of red, green,
and blue pixels in each image.
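To get a feel for what "measuring pixels" means, here's a small Python sketch. The 2x2 image is invented, and flatten and average_green are hypothetical helper names. One is the raw observation a computer starts with, the other is a hand-picked answer to "how much green is there?"

```python
# An invented 2x2 image; each pixel is (red, green, blue) with values 0-255.
image = [
    [(34, 139, 34), (0, 128, 0)],
    [(255, 255, 255), (120, 60, 20)],
]

def flatten(image):
    """Turn rows of (R, G, B) pixels into one long list of numbers."""
    return [value for row in image for pixel in row for value in pixel]

def average_green(image):
    """One hand-picked observation: the mean of the green channel."""
    greens = [pixel[1] for row in image for pixel in row]
    return sum(greens) / len(greens)

features = flatten(image)
print(len(features))         # 12, because 2 * 2 pixels * 3 colors
print(average_green(image))  # 145.5
```

A 100-by-100 photo would already give 30,000 numbers, which is why two-number averages like petal length and width don't carry over.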
Second, how do we want to represent the world?
Before, we were only working with two features, so we could just use averages of the clustered
datapoints and get a meaningful abstraction from them.
But when dealing with images, we can’t use the same method, because we won’t get much
meaning out of averaging colored pixels for what we want to accomplish.
Somehow, we need the model to create a representation that tells us if two images are similar.
There are meaningful patterns in the data that are more abstract than individual pixels,
and finding them across many images is called Representation Learning.
These patterns help us understand what’s in the images and how to compare them to each
other.
Representation learning happens both in supervised and unsupervised learning models, so we can
do it with or without labels to find patterns in the world.
To understand the basic idea of representation learning, check out this experiment: I’m
gonna look at a picture really fast and then try to draw it.
Ready, Set, Go!
Woah. That was 5 seconds?
My eyes took in the picture and remembered important features, so I’m building a representation
in my mind.
But I can’t just show you my thoughts to get feedback on what parts I misremembered,
so I have to produce a reconstruction, or draw the original image from memory.
Alright, so this is what I’ve got.
Now let’s compare my drawing to the original image.
Let’s see: round plate, triangle slice of pizza, some cheese, some crust, tablecloth. Pretty good.
For an AI, making a reconstruction would mean producing all the right pixel values to
recreate the original image.
Our K-means clustering algorithm from before predicted classes for flowers based on how
close the datapoints were to the averages.
For images, we will have learned image representations instead of averages.
After that step, just like before, the AI will have to correct itself.
Previously, we updated the K clusters based on how well our predicted labels fit the data.
But for images, we’d have to update the model’s internal representations based
on its reconstructions.
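To update anything "based on its reconstructions," the model needs a score for how far off a reconstruction is. One common choice, assumed here rather than taken from the episode, is the mean squared difference between pixel values. The tiny four-pixel "images" are invented.

```python
def reconstruction_error(original, reconstruction):
    """Mean squared difference between two equally sized lists of pixel values."""
    if len(original) != len(reconstruction):
        raise ValueError("both images need the same number of values")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstruction)]
    return sum(squared) / len(squared)

# Invented pixel brightness values between 0 and 1.
original = [0.0, 0.5, 1.0, 0.5]
good_guess = [0.0, 0.5, 1.0, 0.0]   # one pixel misremembered
bad_guess = [1.0, 1.0, 0.0, 0.0]    # almost everything wrong

print(reconstruction_error(original, good_guess))  # 0.0625
print(reconstruction_error(original, bad_guess))   # 0.625
```

A lower score means a more faithful drawing, so the correction step nudges the model in whatever direction shrinks this number.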
There are different ways to use unsupervised learning in combination with representation
learning so that an AI can compare images.
Like, for example, there’s a type of neural network called an autoencoder, which uses
the same basic principles of weights and biases to process inputs, pass data on to hidden neuron
layers, and finally to a prediction output layer.
If John-Green-bot was programmed with an autoencoder, the input would be an image, the hidden layers
would contain representations, and the output would be a full reconstruction of the original
image (which gets more accurate the more we train his AI).
Theoretically, I could give John-Green-bot a representation of a pizza and he could reconstruct
the original pizza image.
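Here's a deliberately tiny sketch of that idea in plain Python. It is a big simplification of a real autoencoder: the "images" are invented pairs of numbers that all lie on one line, there is a single hidden number, and there are no biases or activation functions. The names encode, decode, and train, the learning rate, and the starting weights are all assumptions made for illustration.

```python
# Invented 2-number "images" that lie along one line, so one hidden
# number is enough to describe each of them.
data = [(-1.0, -2.0), (-0.5, -1.0), (0.5, 1.0), (1.0, 2.0)]

def encode(x, w):
    """Squeeze a 2-number input into a 1-number representation."""
    return w[0] * x[0] + w[1] * x[1]

def decode(z, v):
    """Expand the 1-number representation back into 2 numbers."""
    return (v[0] * z, v[1] * z)

def average_error(data, w, v):
    """Mean squared reconstruction error over the whole dataset."""
    total = 0.0
    for x in data:
        r = decode(encode(x, w), v)
        total += (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2
    return total / len(data)

def train(data, w, v, rate=0.01, steps=2000):
    """Plain gradient descent on the reconstruction error."""
    w, v = list(w), list(v)
    for _ in range(steps):
        grad_w, grad_v = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(x, w)
            r = decode(z, v)
            err = (r[0] - x[0], r[1] - x[1])
            for i in range(2):
                grad_v[i] += 2 * err[i] * z
                grad_w[i] += 2 * (err[0] * v[0] + err[1] * v[1]) * x[i]
        for i in range(2):
            w[i] -= rate * grad_w[i] / len(data)
            v[i] -= rate * grad_v[i] / len(data)
    return w, v

start_w, start_v = [0.5, 0.5], [0.5, 0.5]
before = average_error(data, start_w, start_v)
w, v = train(data, start_w, start_v)
after = average_error(data, w, v)
print(before, after)   # the error should shrink a lot with training
```

After training, decode(encode(x, w), v) should land close to x for each example, and the single number in the middle is the learned representation.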
What’s so powerful about unsupervised learning is that the world is our teacher.
By looking around, taking in a lot of data, and predicting what we’ll see and hear next,
we learn about how the world works and how it should be represented.
When asked how AI will fulfill its grand ambitions, 2018 Turing Award winner
Professor Yann LeCun said: “We all know that unsupervised learning
is the ultimate answer.”
So I guess we better keep working on it!
Unsupervised learning is a huge area of active research.
The human brain is specially designed for this kind of learning and has different parts
for vision, language, movement, and so on.
These structures and what kinds of patterns our brains look for were developed over billions
of years of evolution.
But it’s really tricky to build an AI that does unsupervised learning well because AI
systems can’t learn exactly like humans often do, just by watching and imitating.
Someone, like us, has to design the models and tell them how to look for patterns before
letting them loose.
Next time, we’ll look at applying similar concepts to AI systems that find patterns
in words and language, in what’s called Natural Language Processing.
See you then!
Thanks to Wix for supporting PBS Digital Studios.
Check out Wix.com if you’re looking to make your own website.
Wix is a platform that allows you to build a personalized website for almost any purpose
from promoting your business or creating an online shop to a place for you to test out
new ideas.
Their technology allows you to create something unique no matter your skill level, with templates
and all-in-one management.
If you’d like to check it out, you can go to wix.com/go/crashcourse
or click the link in the description.
Crash Course AI is produced in association with PBS Digital Studios.
If you want to help keep Crash Course free for everyone, forever, you can join our
community on Patreon.
And if you want to learn more about the math of k-means clustering, check out this video
from Crash Course Statistics.